Abstract


A concurrent system consists of processes communicating via shared objects, such
as shared variables, queues, etc. The concept of wait-freedom was introduced to cope
with process failures: each process that accesses a wait-free object is guaranteed to get
a response even if all the other processes crash. But what if these wait-free objects
themselves fail? For example, if a wait-free object “crashes”, all the processes that
access that object are prevented from making progress. In this paper, we introduce the
concept of fault-tolerant wait-free objects, and study the problem of implementing them.
We give a universal method to construct fault-tolerant wait-free objects, for all types
of “responsive” failures (including one in which faulty objects may “lie”). In sharp
contrast, we prove that many common and interesting object types (such as queues, sets,
and test&set) have no fault-tolerant wait-free implementations even under the most
benign of the “non-responsive” types of failure. We also introduce several concepts and
techniques that are central to the design of fault-tolerant concurrent systems: the
concepts of self-implementation and graceful degradation, and techniques to automatically
increase the fault-tolerance of implementations. We prove matching lower bounds on
the resource complexity of most of our algorithms.


*Research supported by NSF grants CCR-8901780 and CCR-9102231, DARPA/NASA Ames grant
NAG-2-593, grants from the IBM Endicott Programming Laboratory and Siemens Corp.

†Also supported by an IBM graduate fellowship.
1 Introduction


1.1  Background  and  motivation 

A concurrent system consists of processes communicating via shared objects. Examples of
shared object types include data structures such as read/write register, queue, set, and
tree, and synchronization primitives such as test&set, fetch&add, and compare&swap.
Even though different processes may concurrently access a shared object, the object must
behave as if all these accesses occur in some sequential order. More precisely, the behavior
of a shared object must be linearizable [HW90]. One way to ensure linearizability is to
implement shared objects using critical sections [CHP71]. This approach, however, is not
fault-tolerant: the crash of a process while in the critical section of a shared object can
permanently prevent the rest of the processes from accessing that object. This lack of
fault-tolerance led to the concept of wait-free implementations of shared objects. Informally, a
shared object is wait-free if every operation invocation on that object is returned a response
even if some or all other processes in the system crash.

Thus, a concurrent system in which all shared objects are wait-free is resilient to process
crashes. However, such a system is not resilient to shared object failures.¹ For example,
the “crash” of a single shared object stops all the processes that need to access that object.
Motivated by this observation, we study the problem of implementing wait-free shared
objects that are also fault-tolerant. With such objects, the system is guaranteed to make
progress despite process crashes and the failures of some underlying objects. To the best
of our knowledge, the issue of fault-tolerant wait-free shared objects has not been addressed
before. (To simplify notation, hereafter “object” denotes a “shared object”.)

1.2  Object  failures 

We  classify  object  failures  into  two  broad  categories:  Responsive  and  non-responsive.  We 
require  that  objects  subject  to  responsive  failures  continue  to  respond  (in  finite  time)  to 
operation  invocations.  The  responses  may  be  incorrect.  In  contrast,  objects  subject  to 
non-responsive  failures  are  exempt  from  responding  to  operation  invocations.  Such  objects 
may  “hang”  on  the  invoking  process. 

We divide responsive failures into three sub-classes: R-crash, R-omission, and R-arbitrary.
An object subject to R-crash failure behaves correctly until it fails, and once it fails, it
returns a distinguished response ⊥ to every invocation. As with R-crash, an object subject to
R-omission failures may return the correct response or ⊥. However, even if it responds ⊥
to a process p, a subsequent operation invocation by a different process q may get a correct
response. This behavior models an object O made of several components, some of which
failed. The operation by p “ran into” a failed component of O, while the one by q only
encountered correct components of O. Finally, objects subject to R-arbitrary failures may
“lie”, i.e., return arbitrary responses to operation invocations.

¹Even “software” objects have underlying hardware components. The software and/or the hardware
could be faulty.
Similarly, we divide non-responsive failures into crash, omission, and arbitrary. An
object subject to crash failure behaves correctly until it fails, and once it fails, it never
responds to operation invocations. An object subject to omission failures may fail to
respond to the invocations of an arbitrary subset of processes, but continue to respond to the
invocations of the remaining processes (forever). The behavior of an object subject to an
arbitrary failure is completely unrestricted: it may not respond to an invocation, and even
if it does, the response may be arbitrary.

1.3  Fault-tolerant  objects 

Let T be an object type and let $C = (T_1, T_2, \ldots, T_n)$ be a list of object types. A function
$\mathcal{I} : T_1 \times T_2 \times \cdots \times T_n \to T$ is an implementation of T from C if
$O = \mathcal{I}(o_1, o_2, \ldots, o_n)$ is a wait-free object of type T whenever $o_i$ ($1 \le i \le n$) is a
wait-free object of type $T_i$. We call O a derived object (of $\mathcal{I}$) and the $o_i$'s the base objects
of O. $\mathcal{I}$ is t-tolerant for a failure model $\mathcal{M}$ if O behaves correctly even if a maximum of t
base objects of O fail according to $\mathcal{M}$.

The implementation $\mathcal{I}$ is a self-implementation if $T_1 = T_2 = \cdots = T_n = T$. In
other words, in a self-implementation the base objects are required to be of the same
type as the derived object. For example, consider the object type 2-process queue (i.e., a
queue that can be accessed by at most two processes). In Section 6.3, we show that (for
every t) there is a t-tolerant self-implementation of 2-process queue for R-arbitrary failures.
Intuitively, this means that using a set of wait-free 2-process queues, t of which are subject
to R-arbitrary failures, we can implement a failure-free wait-free 2-process queue. Thus in
a self-implementation fault-tolerance is achieved through replication.

1.4  Results 

To  study  whether  a  general  object  type  has  a  t-tolerant  implementation,  we  focus  on  two 
particular object types: consensus² and register. Herlihy [Her91] and Plotkin [Plo89]
showed  that  one  can  implement  a  wait-free  object  of  any  type  using  only  consensus  and 
register  objects.  Thus,  if  consensus  and  register  have  t-tolerant  implementations,  then 
every  object  type  has  a  t-tolerant  implementation. 

We first study the problem of tolerating responsive failures. We give t-tolerant
self-implementations of consensus for R-crash, R-omission, and R-arbitrary failures. For
R-crash and R-omission failures, our self-implementation is optimal, requiring only t + 1 base
consensus objects if t of them may fail. For R-arbitrary failures, our self-implementation is
efficient, requiring O(t log t) base consensus objects. We also give t-tolerant self-implementations
of register for R-crash, R-omission, and R-arbitrary failures. Combining the above results
with [Her91, Plo89], we conclude that every object type T has a t-tolerant implementation
(from consensus and register) for all responsive models of failures. Moreover, if T
implements consensus and register, then T has a t-tolerant self-implementation. This
implies that familiar object types such as (2-process) queue, stack, test&set, fetch&add,
and (N-process) compare&swap have t-tolerant self-implementations even for R-arbitrary
failures!

²A consensus object supports two operations, propose 0 and propose 1, and satisfies the following two
properties. An operation gets a response v only if there is some prior invocation of propose v. Further, the
response is the same for all invocations of both operations.

What about tolerating non-responsive failures? Unfortunately, the results are mostly
negative. We show that there is no 1-tolerant implementation of consensus even for crash
failures, the most benign of the non-responsive models of failures.³ This immediately implies
that any object type T that implements consensus (such as queue, stack, test&set,
swap, compare&swap, etc.) has no 1-tolerant implementations for crash failures. In
contrast, we show that register has a t-tolerant self-implementation even for arbitrary
failures. In addition to these universality and impossibility results, this paper contains the
following results.

Let $\mathcal{I}$ be a t-tolerant implementation for failure model $\mathcal{M}$. By definition, every derived
object of $\mathcal{I}$ is guaranteed to behave correctly even if up to t base objects fail according to
$\mathcal{M}$. But what happens if more than t base objects fail? In general, the derived object may
experience a more severe failure than $\mathcal{M}$! We say a t-tolerant implementation for a failure
model $\mathcal{M}$ is gracefully degrading if the failure of more than t base objects (according to
$\mathcal{M}$) cannot cause the derived object to experience a more severe failure than $\mathcal{M}$. From a
1-tolerant gracefully degrading self-implementation of any object type T for a failure model
$\mathcal{M}$, we show how to recursively construct a t-tolerant self-implementation of T for $\mathcal{M}$. This
provides a method for automatically increasing the fault-tolerance of an object.

In general, graceful degradation increases the cost of an implementation. For instance,
consider t-tolerant implementations of consensus for R-omission failures. As already
mentioned, there is such an implementation using only t + 1 base objects. However, this
implementation is not gracefully degrading. In fact, we show that, in this case, graceful
degradation requires at least 2t + 1 base objects, and we give a matching algorithm.

We prove that there is a large class of object types that have no gracefully degrading
implementations for R-crash. Intuitively, this means that whatever the implementation,
the failure of the implemented object will be more severe than R-crash, even if all its base
objects can only fail by R-crash.

We study the problem of translating severe failures into more benign failures [NT90].
In particular, we show that given 3t + 1 (base) consensus objects, at most t of which are
subject to R-arbitrary failures, we can implement a (derived) consensus object that can
only fail by R-omission. We also show that this translation from R-arbitrary to R-omission
is resource optimal.

We also show that arbitrary failures can be viewed as having two orthogonal
components: omission and R-arbitrary. Specifically, for any object type T, given any t-tolerant
self-implementations $\mathcal{I}'$ and $\mathcal{I}''$ of T for omission failures and R-arbitrary failures
respectively, we show how to construct a t-tolerant self-implementation of T for arbitrary failures.
This decomposition simplifies the problem of tolerating arbitrary failures.

³The impossibility of implementing a fault-tolerant consensus object from any finite list of base objects,
one of which may crash, is shown using the impossibility of solving the consensus problem among a finite
number of processes, one of which may crash [LAA87].
2 Preliminaries


A concurrent system consists of processes communicating via (shared) objects. A process
interacts with an object by invoking an operation, and receiving a corresponding response
from the object. Processes may exhibit arbitrary variations in their execution speeds.
Further, processes may crash. That is, a process may stop at any point in its execution and
never take any steps thereafter.

An object is specified by a type. An object type T is defined by N(T), OP(T) and
A(T), where N(T) is the maximum number of processes that may access an object O of
type T, OP(T) is the set of operations supported by O, and A(T) specifies how O behaves
when these operations are applied sequentially. For concreteness, we assume A(T) is a
finite or infinite state non-deterministic automaton where some states are designated as initial
states. There is a transition from state s to state t labeled (op, v) iff invoking the operation
op when the object is in state s may leave the object in state t, returning the response v.
We say a sequential execution $S = (op_1, v_1), (op_2, v_2), \ldots, (op_k, v_k)$ from state s is consistent
with T iff, viewing A(T) as a directed graph with states as nodes and transitions as directed
edges, there is a directed path labeled S from state s. Further, S is consistent with T if
there is some initial state s of T such that S from s is consistent with T.
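
The consistency test above amounts to following labeled edges in A(T). The following is a minimal Python sketch of that check, not part of the original development: the dictionary encoding of the automaton and the names `consistent_from`, `consistent`, and `CONSENSUS` are assumptions made for illustration, and the example automaton encodes the consensus type described in footnote 2.

```python
# Hypothetical encoding (an assumption of this sketch):
# transitions[state][(op, response)] is the set of possible successor states.

def consistent_from(transitions, start, execution):
    """True iff some directed path labeled by `execution` leaves `start`."""
    current = {start}                      # states reachable so far
    for label in execution:                # label is an (op, response) pair
        successors = set()
        for state in current:
            successors |= transitions.get(state, {}).get(label, set())
        if not successors:                 # no edge carries this label
            return False
        current = successors
    return True

def consistent(transitions, initial_states, execution):
    """True iff the execution is consistent from at least one initial state."""
    return any(consistent_from(transitions, s, execution) for s in initial_states)

# Example automaton: the first proposal fixes the response of every operation.
CONSENSUS = {
    "undecided": {("propose 0", 0): {"decided 0"}, ("propose 1", 1): {"decided 1"}},
    "decided 0": {("propose 0", 0): {"decided 0"}, ("propose 1", 0): {"decided 0"}},
    "decided 1": {("propose 0", 1): {"decided 1"}, ("propose 1", 1): {"decided 1"}},
}
```

Tracking a set of reachable states, rather than a single state, is what handles non-determinism: an execution is rejected only when no branch of the automaton can produce it.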

Each  process  may  have  at  most  one  pending  invocation  on  any  given  object.  That  is,  a 
process  p  cannot  invoke  an  operation  on  an  object  O  unless  the  previous  operation  of  p  on 
object  O  has  already  received  a  response.  However,  operations  from  different  processes  may 
overlap  on  an  object.  The  sequential  specification  is  therefore  not  sufficient  to  understand 
the  behavior  of  an  object.  We  use  linearizability  defined  by  Herlihy  and  Wing  [HW90] 
as  the  criterion  for  the  correctness  of  an  object.  Informally,  linearizability  requires  every 
operation  execution  to  appear  to  take  effect  instantaneously  at  some  point  in  time  between 
its  invocation  and  response.  We  make  this  more  formal  below. 

Let O be an object shared by the processes $p_i$, $i = 1, \ldots, N$. Let $E_t$ be an execution of
the concurrent system $(p_1, p_2, \ldots, p_N, O)$ up to time t. Define $H(E_t)$, the history of the
execution $E_t$, as follows: $(p_i, op, v, t_s, t_e) \in H(E_t)$ iff process $p_i$ invokes operation op in $E_t$
at time $t_s$, and that operation completes at time $t_e$ returning the response v. Further,
$(p_i, op, *, t_s, \infty) \in H(E_t)$ iff process $p_i$ invokes operation op in $E_t$ at time $t_s$, and that
operation does not complete by time t. We say $H(E_t)$ is linearizable with respect to type
T if and only if there exist a sequence S of (operation, response) pairs and a one-to-one
correspondence f from $H(E_t)$ to S satisfying the following:

• S is consistent with respect to T.

• $|S| = |H(E_t)|$, i.e., there are exactly as many elements in the sequence S as there are
in the set $H(E_t)$.

• If $\eta = (p_i, op, v, t_s, t_e) \in H(E_t)$ and $f(\eta) = S_j$, then $S_j = (op, v)$. (Here $S_j$ denotes
the j-th element of the sequence S.)

• If $\eta = (p_i, op, *, t_s, \infty) \in H(E_t)$ and $f(\eta) = S_j$, then $S_j = (op, v)$ for some response v.

• If $\eta' = (p_i, op', v', t'_s, t'_e)$, $\eta'' = (p_j, op'', v'', t''_s, t''_e) \in H(E_t)$, and $t'_e < t''_s$, then $f(\eta') = S_k$
and $f(\eta'') = S_l$ for some $k < l$.
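
For very small histories, the conditions above can be checked by exhaustive search. The sketch below is an illustration under stated assumptions, not an algorithm from this paper: a history is a list of tuples (p, op, v, t_s, t_e), a pending operation is encoded with v = None and t_e = inf, and the sequential specification is supplied as a hypothetical function `step(state, op)` that returns the possible (response, next state) pairs.

```python
from itertools import permutations
from math import inf

def linearizable(history, initial_states, step):
    """Brute-force check of the linearizability conditions (exponential time).

    history: list of (process, op, response, t_s, t_e); a pending operation
             is encoded with response None and t_e = inf.
    step:    step(state, op) -> iterable of (response, next_state) pairs.
    """
    n = len(history)
    for order in permutations(range(n)):          # candidate sequence S
        position = {idx: k for k, idx in enumerate(order)}
        # Real-time order: an operation that completes before another is
        # invoked must appear earlier in S.
        if any(history[a][4] < history[b][3] and position[a] > position[b]
               for a in range(n) for b in range(n)):
            continue
        states = set(initial_states)
        for idx in order:
            _, op, response, _, _ = history[idx]
            # A pending operation may take any response allowed by the type.
            states = {nxt for s in states for (v, nxt) in step(s, op)
                      if response is None or v == response}
            if not states:
                break
        if states:
            return True
    return False

def consensus_step(state, op):
    """Consensus type: state is None until the first proposal decides it."""
    decided = op[1] if state is None else state
    return [(decided, decided)]
```

For example, two overlapping proposals that both return 1 are accepted, because the search may order propose 1 first; the same responses are rejected when propose 0 completes before propose 1 is invoked.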


An object O is of type T if for every t, and every execution $E_t$ of the concurrent system
$(p_1, p_2, \ldots, p_N, O)$ up to time t, $H(E_t)$ is linearizable with respect to T. We say that T
is an N-process type if N = N(T). Any object of an N-process type is an N-process object.

Objects are either primitive or derived. A primitive object is completely “external” to
the invoking process. In other words, after a process invokes an operation on a primitive
object, it may simply wait for the object to return the response. In contrast, a derived object
O is “implemented” in software from base objects (each one of which is either derived
or primitive). Such an implementation provides a procedure Apply($p_i$, op, O) (for each
op ∈ OP(T) and 1 ≤ i ≤ N(T)) that process $p_i$ must execute in order to invoke an operation
on O and receive the corresponding response from O. Each step in Apply($p_i$, op, O) is
either an invocation on a base object of O, or checking if a base object has returned a
response to a previous invocation⁴, or some local computation.

We now define wait-freedom for primitive and derived objects. A primitive object is
wait-free if every operation invocation by every process gets a response in finite time. A
derived object O is wait-free if Apply($p_i$, op, O) (for each op ∈ OP(T) and 1 ≤ i ≤ N(T))
returns a response in a finite number of steps, regardless of the execution speeds of the
remaining processes. Unless mentioned otherwise, all the objects considered in this paper
are wait-free.


3  Models  of  failure 

An object is only an abstraction with a multitude of possible implementations. For instance,
it may be implemented as a hardware module in a tightly coupled multi-processor system,
or as a server machine in a message passing distributed system. Whatever the
implementation, the reality is that hardware components sometimes fail, and when this happens, the
implementation fails to provide the intended abstraction.

Object  failures  may  lead  to  unsatisfactory  system  behavior.  For  instance,  the  “crash” 
of  an  object  prevents  the  progress  of  all  processes  that  access  the  object.  Similarly,  if  the 
object  returns  “incorrect”  responses,  the  system  behavior  also  becomes  incorrect.  It  is 
therefore  important  to  implement  derived  objects  that  behave  correctly  even  if  some  of  the 
base objects of the implementation fail. The cost and the complexity of such a fault-tolerant
implementation depend on the failure model, i.e., the manner in which a failed base object
departs  from  its  expected  behavior.  In  this  paper,  we  define  a  spectrum  of  failure  models 
that  fall  into  two  broad  classes:  Responsive  and  non-responsive. 

⁴Note that $p_i$ does not “block” for the response from the object; it only “polls” for the response, then
proceeds to the next step.
3.1 Responsive models of failure


An object subject to responsive failures responds to every operation invocation. The
response is possibly incorrect, but the object never fails to respond. We describe below three
increasingly severe models of responsive failures.


3.1.1  R-crash 

R-crash is the most benign model of object failure. This model is based on the premise
that an object detects when it becomes faulty. Informally, an object subject to R-crash
behaves correctly until it fails, and once it fails, it returns a distinguished response ⊥ to
every operation invocation. More precisely, an object O of type T subject to R-crash failure
satisfies the following three properties. Let $E_t$ be any execution of the concurrent system
$(p_1, p_2, \ldots, p_N, O)$ up to time t, and $H(E_t)$ be the corresponding history, as defined before.

1. O is wait-free.

2. If $(p, op, \bot, t_s, t_e) \in H(E_t)$, $(p', op', v', t'_s, t'_e) \in H(E_t)$, and $t_e < t'_s$, then $v' = \bot$.

3. Let $H'(E_t) = H(E_t) - \{(p, op, \bot, t_s, t_e) \in H(E_t)\}$. Then $H'(E_t)$ is linearizable with
respect to T.

3.1.2  R-omission 

Suppose O is a wait-free object implemented from some “hardware components”. We
informally argue that O may exhibit a more severe failure than R-crash, even if one of its
“hardware components”, say f, fails by R-crash. If a process p executes an operation op
on O that accesses f, f returns ⊥ to p, causing p to return ⊥ for op. Suppose a different
process q later executes some operation op' on O and op' does not require q to access f.
Process q does not “notice” the failure of f, and thus completes op' returning a non-⊥
response. This violates the “once ⊥, everafter ⊥” property of R-crash.

Suppose that after p gets ⊥ it does not access O again. To q, this scenario is
indistinguishable from one in which p had crashed just before accessing f. Since the implementation
of O from its components is wait-free, it is designed to tolerate p’s apparent crash, and the
non-⊥ response to q must be correct.

In view of these considerations⁵, we formalize the R-omission model of failure as follows.
An object O of type T subject to R-omission failures satisfies the following properties.

1. O is wait-free.

2. Let $E_t$ be any execution of $(p_1, p_2, \ldots, p_N, O)$ up to time t with the following property:
If a process $p_i$ gets a response ⊥ from O for some invocation in $E_t$, then $p_i$ does not
invoke any operation on O subsequently in $E_t$. Defining $H(E_t)$ as before, obtain
$H'(E_t)$ by replacing every tuple of the form $(p, op, \bot, t_s, t_e)$ by $(p, op, *, t_s, \infty)$. Then
$H'(E_t)$ is linearizable with respect to T.

⁵A formal justification for the R-omission model is given in Section 8.


3.1.3 R-arbitrary

An object subject to R-arbitrary failures is free to return arbitrary responses to operation
invocations. The only property we require from such an object is that it be wait-free.
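
To make the three responsive models concrete, the following Python sketch simulates each of them for a consensus object. It is only an illustration: the class names, the `BOT` sentinel standing for ⊥, and the particular failure triggers (a response budget, a fixed set of affected processes, a random generator) are assumptions of the sketch, and each class shows just one of the many behaviors that its model permits.

```python
import random

BOT = "⊥"  # stands for the distinguished response ⊥

class Consensus:
    """Failure-free consensus object (sequential behavior only)."""
    def __init__(self):
        self.decided = None
    def propose(self, p, v):
        if self.decided is None:
            self.decided = v          # the first proposal fixes the decision
        return self.decided

class RCrashConsensus(Consensus):
    """Correct until it fails; afterwards every invocation returns BOT."""
    def __init__(self, correct_responses):
        super().__init__()
        self.remaining = correct_responses   # hypothetical failure trigger
    def propose(self, p, v):
        if self.remaining <= 0:
            return BOT                # once BOT, ever after BOT
        self.remaining -= 1
        return super().propose(p, v)

class ROmissionConsensus(Consensus):
    """Returns BOT to some processes while still answering others correctly."""
    def __init__(self, affected):
        super().__init__()
        self.affected = set(affected)  # processes that hit a failed component
    def propose(self, p, v):
        if p in self.affected:
            return BOT                # in this sketch the operation has no effect
        return super().propose(p, v)

class RArbitraryConsensus(Consensus):
    """Always responds, but the response may be a lie."""
    def __init__(self, rng=None):
        super().__init__()
        self.rng = rng or random.Random()
    def propose(self, p, v):
        return self.rng.choice([0, 1, BOT])
```

The contrast between the first two classes mirrors the discussion in Section 3.1.2: after an R-omission object answers ⊥ to one process, a different process may still obtain a correct response, which an R-crash object never allows.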

3.2  Non-responsive  models  of  failure 

Each  responsive  model  of  failure  has  its  non-responsive  counter-part.  The  difference  lies  in 
the  fact  that  an  object  subject  to  a  non-responsive  failure  model  may  also  fail  to  respond 
to  operation  invocations. 

3.2.1  Crash 

Crash  is  the  most  benign  of  all  non-responsive  models  of  failure.  Informally,  an  object 
subject  to  a  crash  failure  behaves  correctly  until  it  fails,  and  once  it  fails,  it  never  responds 
to  any  operation  invocations.  More  precisely,  an  object  O  of  type  T  subject  to  a  crash 
failure  satisfies  the  following  properties. 

1. If in a (temporally) infinite execution of the concurrent system $(p_1, p_2, \ldots, p_N, O)$, O
never responds to an invocation of some process $p_i$, then the total number of responses
from O in that (temporally) infinite execution is finite.

2. If $E_t$ is any execution of the concurrent system $(p_1, p_2, \ldots, p_N, O)$ up to time t, and
$H(E_t)$ is the corresponding history, then $H(E_t)$ is linearizable with respect to T.

3.2.2  Omission 

Omission  failures  are  more  severe  than  crash.  An  object  subject  to  omission  failures  satisfies 
only  property  2  of  the  crash  model. 

3.2.3  Arbitrary 

An object subject to arbitrary failures is not required to satisfy any properties at all. Thus
the  behavior  of  such  an  object  is  completely  unrestricted.  In  particular,  the  object  may 
choose  not  to  respond  to  an  invocation.  Even  if  it  does,  the  response  can  be  arbitrary. 
4 Definition of fault-tolerant implementations


Let T be an object type and let $C = (T_1, T_2, \ldots, T_n)$ be a list of object types (the $T_i$'s are not
necessarily distinct). A function $\mathcal{I} : T_1 \times T_2 \times \cdots \times T_n \to T$ is an implementation of T
from C if $O = \mathcal{I}(o_1, o_2, \ldots, o_n)$ is a wait-free object of type T whenever $o_i$ ($1 \le i \le n$) is
a wait-free object of type $T_i$. We call O a derived object (of $\mathcal{I}$) and the $o_i$'s the base objects of
O. The resource complexity of $\mathcal{I}$ is n, the number of base objects that make up a derived
object of the implementation. $\mathcal{I}$ is t-tolerant for a failure model $\mathcal{M}$ if O behaves correctly⁶
even if a maximum of t base objects of O fail according to $\mathcal{M}$. Note that, in general, if
more than t base objects fail according to $\mathcal{M}$, O may experience a more severe failure than
$\mathcal{M}$. We say that $\mathcal{I}$ is gracefully degrading if: when base objects only fail according to $\mathcal{M}$,
O is only subject to failures of type $\mathcal{M}$.⁷

The implementation $\mathcal{I}$ is a self-implementation if $T_1 = T_2 = \cdots = T_n = T$. In other
words, in a self-implementation the base objects are required to be of the same type as the
derived object.


5  Some  basic  results 

Gracefully  degrading  self-implementations  have  the  desirable  property  that  they  can  be 
composed  recursively  to  realize  any  extent  of  fault-tolerance.  This  is  formalized  in  the 
following  lemma. 

Lemma 5.1 (Booster Lemma) If a type T has a t-tolerant gracefully degrading self-implementation
$\mathcal{I}$ of resource complexity n for a failure model $\mathcal{M}$, then T has a $(t^2 + 2t)$-tolerant gracefully
degrading self-implementation of resource complexity $n^2$ for $\mathcal{M}$.

Proof (sketch) Let $\mathcal{I} = \lambda(o_1, o_2, \ldots, o_n)\, F(o_1, o_2, \ldots, o_n)$. Define
$\mathcal{I}' = \lambda(o_1, o_2, \ldots, o_{n^2})\, F(F(o_1, \ldots, o_n), F(o_{n+1}, \ldots, o_{2n}), \ldots, F(o_{(n-1)n+1}, \ldots, o_{n^2}))$.
It is easy to verify that $\mathcal{I}'$ is a gracefully degrading $(t^2 + 2t)$-tolerant self-implementation of T
for $\mathcal{M}$. □
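
The counting behind the bound $t^2 + 2t$ can be spelled out as follows. This is a reading of the proof sketch rather than a statement taken from it, and the names $t_j$ and $n_j$ (tolerance and resource complexity after $j$ applications of the lemma) are introduced here only for illustration. Graceful degradation ensures that an inner derived object with more than $t$ failed base objects still fails only according to $\mathcal{M}$, so the outer copy of $F$ misbehaves only if at least $t + 1$ inner objects fail, and each of those needs at least $t + 1$ failed base objects.

```latex
\begin{align*}
\text{failures needed to break } \mathcal{I}' &\ge (t+1)(t+1) = t^2 + 2t + 1,\\
\text{hence failures tolerated by } \mathcal{I}' &= t^2 + 2t = (t+1)^2 - 1.\\[4pt]
\text{Iterating: } t_0 = t,\quad t_{j+1} = (t_j + 1)^2 - 1
  &\;\Longrightarrow\; t_j = (t+1)^{2^j} - 1,\\
n_0 = n,\quad n_{j+1} = n_j^2
  &\;\Longrightarrow\; n_j = n^{2^j}.
\end{align*}
```

For instance, starting from a 1-tolerant implementation with $n$ base objects, one application gives tolerance 3 with $n^2$ base objects, and two applications give tolerance 15 with $n^4$.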

Recursive application of the booster lemma gives the following corollary.

Corollary 5.1 If a type T has a 1-tolerant gracefully degrading self-implementation of
resource complexity k for a failure model $\mathcal{M}$, then T has a t-tolerant gracefully degrading
self-implementation of resource complexity  for $\mathcal{M}$.

In  Section  6.1.4,  we  illustrate  how  this  corollary  can  be  applied  to  construct  a  t-tolerant 
self-implementation  of  consensus  for  R-arbitrary  failures. 

Our next result states that arbitrary failures have a responsive (R-arbitrary) and a
non-responsive (omission) component. Thus the problem of tolerating arbitrary failures can be
reduced to two strictly simpler problems: tolerating R-arbitrary failures and tolerating
omission failures.

⁶That is, O remains wait-free and linearizable with respect to T.

⁷Even if all the base objects of O fail!


Lemma 5.2 (Decomposability of arbitrary failures) A type T has a t-tolerant self-implementation
for arbitrary failures if and only if T has a t-tolerant self-implementation $\mathcal{I}'$ for R-arbitrary
failures and $\mathcal{I}''$ for omission failures.

Proof (sketch) The “only if” direction is obvious. To prove the “if” direction, suppose
there are $\mathcal{I}' = \lambda(o_1, o_2, \ldots, o_m)\, F_a(o_1, o_2, \ldots, o_m)$ and $\mathcal{I}'' = \lambda(o_1, o_2, \ldots, o_n)\, F_o(o_1, o_2, \ldots, o_n)$.
Define $\mathcal{I} = \lambda(o_1, o_2, \ldots, o_{mn})\, F_o(F_a(o_1, \ldots, o_m), \ldots, F_a(o_{(n-1)m+1}, \ldots, o_{mn}))$. It can be
verified that $\mathcal{I}$ is a t-tolerant self-implementation of T for arbitrary failures. □


6  Tolerating  responsive  failures 

To  study  whether  an  arbitrary  object  type  has  a  t-tolerant  implementation,  we  focus  on 
two particular object types: consensus and register. Herlihy [Her91] and Plotkin [Plo89]
showed  that  one  can  implement  a  wait-free  object  of  any  type  using  only  consensus  and 
register  objects.  Thus,  if  consensus  and  register  have  t-tolerant  implementations,  then 
every  object  type  has  a  t-tolerant  implementation. 

6.1 Fault-tolerant implementation of consensus

In the following, we first define the object type N-consensus. We then present a t-tolerant
self-implementation of N-consensus that works for both R-crash and R-omission failures.
This implementation requires t + 1 base N-consensus objects, and is resource optimal.
Following that, we show how to translate R-arbitrary failures of N-consensus objects to
R-omission failures. Our translation is also proved to be resource optimal. Although the
above two results can be chained together to obtain a t-tolerant self-implementation of
N-consensus for R-arbitrary failures, the resultant self-implementation is not resource
efficient: it requires $O(t^2)$ base consensus objects. We therefore present an alternative efficient
self-implementation of resource complexity O(t log t).


6.1.1  N-consensus  object  type 

The consensus problem for a system of N processes is defined as follows. Each process $p_i$
is given a binary input $v_i$ initially. The consensus problem requires each correct process
to eventually reach the same (irrevocable) decision value d such that $d \in \{v_1, v_2, \ldots, v_N\}$.
The object type N-consensus is defined so that an object of this type makes the consensus
problem solvable in a system of N processes.

N-consensus is an N-process type that supports two operations, propose 0 and propose
1, and has the following sequential specification. If the first operation invoked is propose
v, then every invocation (including the first) is returned the response v. Together with
linearizability, this sequential specification implies that an object O is of type N-consensus
iff it satisfies the following three properties:

•  VzJidity:  O  returns  a  response  u  €  {0,1}  to  an  invocation  (from  process  p)  only  if 
there  is  a  prior  invocation  of  propose  v  on  O  (by  some  process,  possibly  p  itself) . 

•  Agreement:  If  O  returns  t;i,U2  to  two  invocations,  and  ui,U2  €  {0,1},  then  ui  =  V2. 

•  Integrity:  The  response  returned  to  an  invocation  by  O  is  either  0  or  1. 

Let  loc  :=  Propose(p,  v,  O)  denote  that  process  p  invokes  propose  v  on  O  and  stores  the 
response  returned  in  its  local  variable  loc. 
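
The sequential specification above can be illustrated with a minimal Python sketch. The class name ConsensusObject and its propose method are illustrative assumptions, not notation from this paper, and the sketch models sequential behaviour only; it says nothing about linearizability under concurrent access.

```python
from typing import Optional


class ConsensusObject:
    """Sequential sketch of N-consensus: the first proposal fixes every response."""

    def __init__(self) -> None:
        self.decided: Optional[int] = None  # nothing proposed yet

    def propose(self, v: int) -> int:
        if v not in (0, 1):
            raise ValueError("proposals are binary")
        if self.decided is None:
            self.decided = v  # the first operation invoked wins
        return self.decided


obj = ConsensusObject()
responses = [obj.propose(1), obj.propose(0), obj.propose(0)]
print(responses)  # [1, 1, 1]
```

Every response equals the first proposed value, so validity, agreement, and integrity hold in this sequential run.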

6.1.2  Tolerating  R-crash  and  R-omission  failures 

We  present  a  t-tolerant  self-implementation  of  N-consensus  for  R-omission  failures.  Since 
R-omission  failures  are  strictly  more  severe  than  R-crash,  the  same  implementation  also 
works  for  R-crash  failures. 

A consensus object satisfies weak integrity if every response returned by the object is
in {0, 1, ⊥}.

Proposition 6.1 Any N-consensus object that fails by R-omission satisfies validity,
agreement, and weak integrity. Conversely, if a failed N-consensus object satisfies validity,
agreement, and weak integrity, then the failure is R-omission.

Proof  Follows  from  the  definitions.  □ 


O_1, O_2, ..., O_{t+1} : N-consensus objects

Procedure Propose(p, v_p, O)    /* v_p ∈ {0,1} */
    estimate_p, w, k : integer local to p
begin
    estimate_p := v_p
    for k := 1 to t + 1 do
        w := propose(p, estimate_p, O_k)
        if w ≠ ⊥ then estimate_p := w
    return(estimate_p)
end


Figure 1: t-tolerant self-implementation of N-consensus for R-omission

Theorem 6.1 Figure 1 gives a t-tolerant self-implementation of N-consensus for R-omission
failures. The resource complexity of the implementation is t + 1 and is optimal.

Proof (Sketch)

Assume that at most t base objects fail by R-omission. We show below that the derived
object O is a correct N-consensus object.

1. O satisfies validity: Using Proposition 6.1, and the fact that p does not change
estimate_p if a base object returns ⊥, it is easy to verify by an induction on k that
if estimate_p equals some value v at any point, then there is a prior invocation (from
some process q) of Propose(q, v, O).

2. O satisfies agreement: Since at most t base objects fail, there is an O_k (1 ≤ k ≤ t + 1)
that does not fail. So O_k returns the same response w ∈ {0,1} to every process that
accesses it. This implies that for all p that access O_k, estimate_p = w when p completes
the k-th iteration of the loop, and due to Proposition 6.1, it never changes thereafter.

Thus O returns the same response w to every p.

It is obvious that O always returns 0 or 1, and that O is wait-free.

Any t-tolerant self-implementation for R-omission failures must handle the case where
t base objects fail (by R-crash) initially. It is therefore obvious that the resource complexity
of t + 1 of our self-implementation is optimal. □
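
The procedure of Figure 1 can be sketched in Python as follows. This is a sequential illustration under stated assumptions: the class BaseConsensus, its omit predicate (which decides which invocations receive ⊥), and the use of None for ⊥ are all hypothetical modelling choices, and the two processes run one after the other rather than concurrently.

```python
from typing import Callable, List, Optional

BOTTOM = None  # stands for the response ⊥


class BaseConsensus:
    """Base object; omit(call_no) says which invocations get ⊥ (assumed fault model)."""

    def __init__(self, omit: Callable[[int], bool] = lambda call_no: False) -> None:
        self.decided: Optional[int] = None
        self.calls = 0
        self.omit = omit

    def propose(self, v: int) -> Optional[int]:
        self.calls += 1
        if self.omit(self.calls):
            return BOTTOM  # omission: ⊥ is returned and the state is unchanged
        if self.decided is None:
            self.decided = v
        return self.decided


def propose(v_p: int, base: List[BaseConsensus]) -> int:
    """Figure 1: visit all t + 1 base objects, adopting every non-⊥ response."""
    estimate = v_p
    for obj in base:
        w = obj.propose(estimate)
        if w is not BOTTOM:
            estimate = w
    return estimate


# t = 2: two faulty base objects and one correct one.
base = [
    BaseConsensus(omit=lambda n: True),        # always returns ⊥
    BaseConsensus(omit=lambda n: n % 2 == 1),  # returns ⊥ on odd-numbered calls
    BaseConsensus(),                           # correct
]
result_p = propose(0, base)
result_q = propose(1, base)
print(result_p, result_q)  # 0 0
```

In this run the single correct base object forces both processes onto the value 0, which mirrors the agreement argument in the proof sketch.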

The above (self) implementation is not gracefully degrading. For instance, suppose that
v_p = 0 and v_q = 1, and the t + 1 base objects fail by R-crash initially. It is easy to see that
O returns 0 to p and 1 to q. Thus O does not satisfy agreement, and by Proposition 6.1,
the failure of O is more severe than R-omission. In fact, we will now show that 2t + 1 is
both a lower and upper bound on the resource complexity of a t-tolerant gracefully degrading
self-implementation of N-consensus for R-omission*. The self-implementation that requires
2t + 1 base objects is given in Figure 2.

Claim 6.1 Let v be the value of estimate_p and V be the value of V_p at the end of k iterations
(1 ≤ k ≤ 2t + 1) of the for-loop of Propose(p, v_p, O) in Figure 2. Then v ∈ {0,1}, and
V[1..k] contains only ⊥'s and v's.

Proof  By  an  easy  induction  on  k.  □ 

Theorem 6.2 Figure 2 gives a t-tolerant gracefully degrading self-implementation of N-consensus
for R-omission.

Proof Assume all failures of base objects are by R-omission. We first show that, even if
more than t base objects fail, O satisfies validity, agreement, and weak integrity:

* As will be shown later in Theorem 8.2, there is no gracefully degrading implementation of N-consensus
for R-crash.

O_1, O_2, ..., O_{2t+1} : N-consensus objects

Procedure Propose(p, v_p, O)    /* v_p ∈ {0,1} */
    V_p[1..2t+1], estimate_p, w, k : integer local to p
begin
    estimate_p := v_p
    for k := 1 to 2t + 1 do
        w := propose(p, estimate_p, O_k)
        V_p[k] := w
        if (w ≠ ⊥) ∧ (w ≠ estimate_p) then
            estimate_p := w
            V_p[1..(k-1)] := (⊥, ⊥, ..., ⊥)
    if V_p has more than t ⊥'s then
        return(⊥)
    else return(estimate_p)
end


Figure 2: t-tolerant gracefully degrading self-implementation of N-consensus for R-omission
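
The gracefully degrading procedure of Figure 2 can be sketched in Python in the same way. As before, BaseConsensus, its omit predicate, and None standing for ⊥ are hypothetical modelling choices, and the processes are run sequentially; the sketch illustrates the bookkeeping on V_p and does not replace the proof.

```python
from typing import Callable, List, Optional

BOTTOM = None  # stands for the response ⊥


class BaseConsensus:
    """Base object; omit(call_no) says which invocations get ⊥ (assumed fault model)."""

    def __init__(self, omit: Callable[[int], bool] = lambda call_no: False) -> None:
        self.decided: Optional[int] = None
        self.calls = 0
        self.omit = omit

    def propose(self, v: int) -> Optional[int]:
        self.calls += 1
        if self.omit(self.calls):
            return BOTTOM  # omission: state is left unchanged
        if self.decided is None:
            self.decided = v
        return self.decided


def propose_gd(v_p: int, base: List[BaseConsensus], t: int) -> Optional[int]:
    """Figure 2 over 2t + 1 base objects; returns ⊥ when too many responses are ⊥."""
    estimate = v_p
    V: List[Optional[int]] = [BOTTOM] * len(base)
    for k, obj in enumerate(base):
        w = obj.propose(estimate)
        V[k] = w
        if w is not BOTTOM and w != estimate:
            estimate = w
            V[:k] = [BOTTOM] * k  # forget responses recorded before the switch
    if V.count(BOTTOM) > t:
        return BOTTOM
    return estimate


# t = 1 with one faulty base object: both processes obtain a binary value.
one_fault = [BaseConsensus(omit=lambda n: True), BaseConsensus(), BaseConsensus()]
ok = (propose_gd(0, one_fault, 1), propose_gd(1, one_fault, 1))

# t = 1 with all three base objects faulty: the derived object itself answers ⊥.
all_fault = [BaseConsensus(omit=lambda n: True) for _ in range(3)]
degraded = (propose_gd(0, all_fault, 1), propose_gd(1, all_fault, 1))
print(ok, degraded)  # (0, 0) (None, None)
```

With more than t faulty base objects the procedure of Figure 1 would return 0 to one process and 1 to the other, whereas here both receive ⊥, which is an R-omission failure of the derived object.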


1. O satisfies validity: Using Proposition 6.1, and the fact that a process p does not
change estimate_p if a base object returns ⊥, it is easy to verify by an induction on
k that if estimate_p equals some value v at any point, then there is a prior invocation
(from some process q) of Propose(q, v, O).

2. O satisfies agreement: Suppose, for a contradiction, there exist two processes p and
q such that Propose(p, v_p, O) returns 0 and Propose(q, v_q, O) returns 1. From Claim
6.1, and lines 8, 9 of the algorithm, it follows that V_p has at least t + 1 0's at the end
of the execution of Propose(p, v_p, O) and V_q has at least t + 1 1's at the end of the
execution of Propose(q, v_q, O). This is possible only if there is a k (1 ≤ k ≤ 2t + 1) such
that propose(p, estimate_p, O_k) returned 0 and propose(q, estimate_q, O_k) returned 1.
Thus O_k does not satisfy agreement. By Proposition 6.1, the failure of O_k is not
R-omission, a contradiction.

3.  O  satisfies  weak  integrity:  Trivial  to  verify. 

4. O satisfies integrity if at most t base objects fail: Let O_{k_1}, O_{k_2}, ..., O_{k_l} (k_1 < k_2 <
... < k_l) be all the correct base objects. Since at most t fail, we have l ≥ t + 1.
By the integrity and agreement properties of O_{k_1}, there is a v ∈ {0,1} such that for
all p, propose(p, estimate_p, O_{k_1}) returns v. Thus for all p, estimate_p = v at the end
of k_1 iterations of the for-loop in Propose(p, v_p, O). Using this and Proposition 6.1,
it is easy to verify that at the end of the execution of Propose(p, v_p, O), V_p[k_i] = v
and estimate_p = v for all p and for all 1 ≤ i ≤ l. This implies, by lines 8, 9 of the
algorithm, that Propose(p, v_p, O) returns v.


From 1, 2, and 4 above, we conclude that the self-implementation is t-tolerant for
R-omission. From 1, 2, and 3 above, together with Proposition 6.1, we conclude that the
self-implementation is gracefully degrading for R-omission. □

Theorem 6.3 The resource complexity of any t-tolerant gracefully degrading implementation
of N-consensus (N ≥ 2) for R-omission is at least 2t + 1.

Proof For a contradiction, assume that there is a t-tolerant gracefully degrading
implementation I from C = {T_1, T_2, ..., T_n}, of N-consensus for R-omission, where n ≤ 2t. Let
O = I(o_1, o_2, ..., o_n). Consider the following interleaving of processes p and q.

Scenario 


1. Process p invokes Propose(p, 0, O) and executes the steps of Propose(p, 0, O) until
either it accesses exactly t base objects or it completes the execution of Propose(p, 0, O),
whichever is earlier. Let S_p denote the set of base objects accessed by p. Every base
object o ∈ S_p behaves correctly to p's invocations. Note that |S_p| ≤ t.

2. Process q invokes and completes the execution of Propose(q, 1, O). Let S_q denote the
set of base objects accessed by q, and T_q = S_q - S_p. The base objects behave as
follows: Every base object o ∈ S_p accessed by q returns ⊥ to q and undergoes no
change in its state; every base object o ∈ T_q behaves correctly to q's invocations. So
q sees at most |S_p| ≤ t failures of base objects.

3. Process p resumes execution (thus |S_p| = t), and completes any remaining steps of
Propose(p, 0, O). The base objects behave as follows: Every o ∈ T_q accessed by p
returns ⊥ to p; every other base object accessed by p behaves correctly to p's invocations.
Note that T_q = S_q - S_p ⊆ {o_1, o_2, ..., o_n} - S_p, and thus |T_q| ≤ n - t ≤ t. So p sees
at most |T_q| ≤ t failures of base objects.

In a scenario such as the above, we assume that all steps in item k strictly precede
every step in item k + 1.

We  make  the  following  conclusions  from  the  above  scenario. 

1. From the characterization of how the failed base objects behave, it is clear that all
failures are by R-omission. Since I is gracefully degrading, the failure of O is no more
severe than R-omission. Thus, by Proposition 6.1, O satisfies validity, agreement, and
weak integrity.

2. In the scenario described, neither process “knows” that the other process is also
running. Thus, by validity and weak integrity, Propose(p, 0, O) must return either 0
or ⊥, and Propose(q, 1, O) must return either 1 or ⊥.

3. In the scenario described, neither process sees more than t base object failures. Since I
is t-tolerant, it follows that neither Propose(p, 0, O) nor Propose(q, 1, O) may return
⊥. Together with Conclusion 2, this implies that Propose(p, 0, O) returns 0 and
Propose(q, 1, O) returns 1. Thus object O violates agreement (required by Conclusion
1). We conclude that I is not a gracefully degrading t-tolerant implementation.


□ 


6.1.3 Translation from R-arbitrary to R-omission

A t-tolerant translation from a failure model M to a (less severe) failure model M' for object
type T is a self-implementation I : T × T × ... × T → T such that O = I(o_1, o_2, ..., o_n)
fails according to M' if a maximum of t base objects of O fail according to M (and the
remaining base objects are correct). Note that if no base objects fail, by definition of an
implementation, O does not fail either.

In this section, we present a t-tolerant translation from R-arbitrary to R-omission for
N-consensus. It is easy to see that this translation can be used along with the t-tolerant
self-implementation for R-omission to obtain a t-tolerant self-implementation of N-consensus
for R-arbitrary failures. This is the principal motivation for studying such a translation.
We will also show that the resource complexity, 3t + 1, of our translation is optimal.

Since a consensus object that suffers an R-arbitrary failure may return a non-binary
response, we find it convenient to define f-propose(p, v, O) as in Figure 3.


Procedure f-propose(p, v, O)
begin
    loc := propose(p, v, O)
    if loc ∈ {0,1} then
        return(loc)
    else return(0)
end


Figure  3:  Filtering  an  arbitrary  response  to  a  binary  response 


Let O be the derived object of the translation in Figure 4. The base objects of O are
A[1..2t+1], B[1..t]. In the following claims, assume that at most t base objects suffer
R-arbitrary failures, and the remaining are correct.

Claim 6.2 O satisfies weak integrity. Further, if no base object fails, O satisfies integrity.


Claim 6.3 O satisfies validity.

A[1..2t+1], B[1..t] : wait-free N-consensus objects

Procedure Propose(p, v_p, O)
    count_p[0..1], w, i, belief_p : integer local to p
begin
1   Phase 1: count_p[0..1] := (0,0)
2            for i := 1 to 2t + 1 do
3                w := f-propose(p, v_p, A[i])
4                count_p[w] := count_p[w] + 1
5   Phase 2: Choose belief_p such that
             count_p[belief_p] > count_p[1 - belief_p]
6            for i := 1 to t do
7                if belief_p ≠ f-propose(p, belief_p, B[i])
8                    return(⊥)
9                    exit Propose
10           return(belief_p)
end


Figure 4: t-tolerant translation from R-arbitrary to R-omission for N-consensus
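
The filter of Figure 3 and the two phases of Figure 4 can be sketched in Python as follows. The classes Correct and Liar are hypothetical stand-ins for a correct base object and for one particular kind of R-arbitrary fault (scripted, possibly non-binary answers); None stands for ⊥, and the processes are run sequentially.

```python
from typing import List, Optional, Sequence

BOTTOM = None  # stands for the response ⊥


class Correct:
    """Correct base consensus object (sequential behaviour only)."""

    def __init__(self) -> None:
        self.decided: Optional[int] = None

    def propose(self, v: int) -> int:
        if self.decided is None:
            self.decided = v
        return self.decided


class Liar:
    """Assumed R-arbitrary fault: cycles through scripted, possibly non-binary answers."""

    def __init__(self, answers: Sequence[object]) -> None:
        self.answers = list(answers)
        self.calls = 0

    def propose(self, v: int) -> object:
        answer = self.answers[self.calls % len(self.answers)]
        self.calls += 1
        return answer


def f_propose(v: int, obj) -> int:
    """Figure 3: map any non-binary response to 0."""
    loc = obj.propose(v)
    return loc if isinstance(loc, int) and loc in (0, 1) else 0


def translate_propose(v_p: int, A: List[object], B: List[object]) -> Optional[int]:
    """Figure 4: majority over A[1..2t+1], then confirmation by every B[1..t]."""
    count = [0, 0]
    for obj in A:  # Phase 1
        count[f_propose(v_p, obj)] += 1
    belief = 0 if count[0] > count[1] else 1
    for obj in B:  # Phase 2
        if f_propose(belief, obj) != belief:
            return BOTTOM
    return belief


# t = 1, faulty object among the A's: both processes still return 0.
A = [Correct(), Liar([1, "junk"]), Correct()]
B = [Correct()]
r_p = translate_propose(0, A, B)
r_q = translate_propose(1, A, B)

# t = 1, faulty object among the B's: the derived object answers ⊥.
A2 = [Correct() for _ in range(3)]
B2 = [Liar([1])]
r_bad_B = translate_propose(0, A2, B2)
print(r_p, r_q, r_bad_B)  # 0 0 None
```

In the second run the lying B object cannot make the derived object return a wrong binary value; the worst outcome is ⊥, which is the R-omission behaviour the translation aims for.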


Proof Suppose O returns v ∈ {0,1} to the invocation Propose(p, v_p, O) (from process
p). Then v = belief_p (by line 10), and count_p[v] = count_p[belief_p] ≥ t + 1 (by line 5). So
there is at least one correct base object A[i] such that propose(p, v_p, A[i]) returned v. By
validity of A[i], it follows that some process q invoked propose(q, v_q, A[i]) where v_q = v.
This implies that q invoked Propose(q, v, O). □

Claim  6.4  O  satisfies  agreement. 

Proof Suppose O fails to satisfy agreement by returning v_1 ∈ {0,1} to some process p, and
v_2 ∈ {0,1} to a different process q where v_1 ≠ v_2. O returns v_1 to p implies v_1 = belief_p.
Similarly v_2 = belief_q. Since v_1 ≠ v_2, we have belief_p ≠ belief_q. It is easy to verify that
if all of A[1..2t+1] are correct, then belief_p = belief_q. It follows that at least one of
A[1..2t+1] fails.

Further, O returns v_1 to p implies that for all 1 ≤ i ≤ t, propose(p, belief_p, B[i]) returns
belief_p = v_1 to p. Similarly, for all 1 ≤ i ≤ t, propose(q, belief_q, B[i]) returns belief_q = v_2
to q. Thus all t base objects B[1..t] fail by not satisfying agreement. Thus counting the
failed A[i]'s and B[i]'s, we have more than t failed base objects, a contradiction. □

Together with Proposition 6.1, the above claims trivially imply the following theorem.

Theorem 6.4 Figure 4 presents a t-tolerant translation from R-arbitrary failures to
R-omission failures for N-consensus. The resource complexity of the translation is 3t + 1.


Theorem 6.5 The resource complexity of any translation I from R-arbitrary to R-omission
for N-consensus is at least 3t + 1.

Proof For a contradiction, assume the resource complexity of I is n ≤ 3t. We prove
the theorem through a series of claims, involving “indistinguishable” scenarios. Let O =
I(o_1, o_2, ..., o_n). In the following we say a process p touches a base object o_i if during the
execution of Propose(p, v_p, O), p executes propose(p, *, o_i).

Claim 6.5 Suppose p executes Propose(p, 0, O) to completion. If all base objects are
correct, then p touches at least t + 1 base objects.

Proof Suppose the claim is false, and p touches only o_{i_1}, o_{i_2}, ..., o_{i_m} (m ≤ t) before exiting
Propose(p, 0, O). Since all base objects are correct, O satisfies validity and integrity. Hence
Propose(p, 0, O) returns 0. Now consider the following two scenarios.

Scenario S1

1. p executes Propose(p, 0, O) to completion touching only o_{i_1}, o_{i_2}, ..., o_{i_m} (m ≤ t).
Propose(p, 0, O) returns 0.

2. q executes Propose(q, 1, O) to completion.

Scenario S2

1. o_{i_1}, o_{i_2}, ..., o_{i_m} fail and behave as though they are touched by p exactly as in scenario
S1. This is possible since m ≤ t.

2. q executes Propose(q, 1, O) to completion.

Since no base objects fail in S1, O behaves correctly. In particular, O satisfies integrity and
agreement. Thus Propose(q, 1, O) returns 0 in S1. Clearly S1 ≈_q S2 (we write S1 ≈_q S2
to denote that Scenarios S1 and S2 are indistinguishable to process q). So Propose(q, 1, O)
returns 0 in S2 also, violating validity. By Proposition 6.1, this failure of O in S2 is not
R-omission. Since fewer than t + 1 base objects fail in S2, the translation I is incorrect, a
contradiction. □

Claim 6.6 Consider
Scenario S3

1. p executes Propose(p, 0, O) up to the point where it has exactly touched t base objects
o_{i_1}, o_{i_2}, ..., o_{i_t}.

2. q executes Propose(q, 1, O) to completion.

Then Propose(q, 1, O) returns 1.

Proof Let S = {base objects touched by q} - {o_{i_1}, o_{i_2}, ..., o_{i_t}}. Let o_{j_1}, o_{j_2}, ..., o_{j_k} be
all the base objects in S arranged so that the first invocation of q on o_{j_l} is before the first
invocation of q on o_{j_{l+1}}. Note that k ≤ n - t ≤ 2t.

Let S2' represent scenario S2 when m = t. Since fewer than t + 1 base objects fail in
S2', the failure of O cannot be more severe than R-omission. Hence, by Proposition 6.1,
O satisfies validity and weak integrity in S2'. So Propose(q, 1, O) returns 1 or ⊥ in S2'.
Since S2' ≈_q S3, we conclude Propose(q, 1, O) returns 1 or ⊥ in S3. Further, since no base
object fails in S3, O satisfies integrity in S3. So Propose(q, 1, O) returns either 0 or 1 in
S3. Together the above two conclusions imply the claim. □

Claim 6.7 Consider
Scenario S4

1. p executes Propose(p, 0, O) up to the point where it has exactly touched t base objects
o_{i_1}, o_{i_2}, ..., o_{i_t}.

2. Let o_{j_1}, o_{j_2}, ..., o_{j_k} be as defined above (note k ≤ 2t). q executes Propose(q, 1, O) up
to the point where it has touched exactly {o_{j_1}, o_{j_2}, ..., o_{j_{k-t}}}.

3. p completes the execution of Propose(p, 0, O).

Then Propose(p, 0, O) returns 0.

Proof Consider
Scenario S5

1. p executes Propose(p, 0, O) up to the point where it has exactly touched t base objects
o_{i_1}, o_{i_2}, ..., o_{i_t}.

2. The base objects o_{j_1}, o_{j_2}, ..., o_{j_{k-t}} fail and behave as though they are touched by q
exactly as in S4.

3. p completes the execution of Propose(p, 0, O).

Since k ≤ 2t, the number of failed base objects in S5 = k - t ≤ t, and therefore (by
Proposition 6.1) O satisfies validity and weak integrity. So Propose(p, 0, O) returns either 0
or ⊥ in S5. Since clearly S4 ≈_p S5, Propose(p, 0, O) returns either 0 or ⊥ in S4 also. However,
since no base object fails in S4, O must satisfy integrity in S4. Thus Propose(p, 0, O) returns
0 in S4. □

Claim 6.8 Consider
Scenario S6

1. p executes Propose(p, 0, O) up to the point where it has exactly touched t base objects
o_{i_1}, o_{i_2}, ..., o_{i_t}.

2. q executes Propose(q, 1, O) to completion, returning 1, by Claim 6.6.

3. Let o_{j_1}, o_{j_2}, ..., o_{j_k} be as defined above (note k ≤ 2t). {o_{j_{k-t+1}}, ..., o_{j_k}} fail
and behave as though they are never touched by q.

4. p completes the execution of Propose(p, 0, O).

Then Propose(p, 0, O) returns 0.

Proof Since S5 ≈_p S6, Propose(p, 0, O) returns 0 in S6. □

From the above claim, it is clear that O does not satisfy agreement in S6. Hence, by
Proposition 6.1, the failure of O in S6 is more severe than R-omission. Since fewer than
t + 1 base objects fail in S6, the translation I is incorrect, a contradiction. This completes
the proof of Theorem 6.5. □

6.1.4  Tolerating  R-arbitrary  failures 

Since N-consensus has a t-tolerant self-implementation for R-omission failures, and has a
t-tolerant translation from R-arbitrary to R-omission failures, it follows that N-consensus
has a t-tolerant self-implementation for R-arbitrary failures also. However, the resulting
self-implementation is expensive, requiring (3t + 1)(t + 1) base objects. Our main goal in this
section is to present a t-tolerant self-implementation for R-arbitrary failures whose resource
complexity is only O(t log t). This implementation employs the divide-and-conquer strategy.

In the following, we first present the base step: obtaining a 1-tolerant self-implementation
(Figure 5). This requires 6 base consensus objects, while the above mentioned approach
through translation requires 8 base consensus objects. Then we show the recursive step of
obtaining a t-tolerant self-implementation from a t/2-tolerant self-implementation (Figure
6).

Claim 6.9 If at most one of O_i, O_{i+1}, and O_{i+2} (i = 1 or 4) fails, then an execution e of
Access(p, O_i, O_{i+1}, O_{i+2}, w) (see Figure 5) returns v only if there is some other execution
e' of Access(q, O_i, O_{i+1}, O_{i+2}, v) (for some q) that either precedes or is concurrent
with e.

Claim 6.10 If none of O_i, O_{i+1}, and O_{i+2} (i = 1 or 4) fails, then, for all p and q,
Access(p, O_i, O_{i+1}, O_{i+2}, v_p) returns the same value as Access(q, O_i, O_{i+1}, O_{i+2}, v_q).

Theorem 6.6 Figure 5 gives a 1-tolerant gracefully degrading self-implementation of N-consensus
for R-arbitrary failures.

O_i : N-consensus objects (1 ≤ i ≤ 6)

Procedure Access(p, O_1, O_2, O_3, v)
    count_p[0..1], w : integer local to p
begin
    count_p[0..1] := (0,0)
    for i := 1 to 3 do
        w := f-propose(p, v, O_i)
        count_p[w] := count_p[w] + 1
    if count_p[0] > count_p[1] then
        return(0)
    else return(1)
end

Procedure Propose(p, v, O)
begin
    v := Access(p, O_1, O_2, O_3, v)
    v := Access(p, O_4, O_5, O_6, v)
    return(v)
end


Figure 5: 1-tolerant self-implementation of N-consensus for R-arbitrary failures
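
A sequential Python sketch of Figure 5 follows. The classes Correct and Contrarian are hypothetical: Contrarian models just one possible R-arbitrary fault (it always answers the opposite of what is proposed), and the sketch does not exercise the concurrent interleavings that the claims above are about.

```python
from typing import List, Optional


class Correct:
    """Correct base consensus object (sequential behaviour only)."""

    def __init__(self) -> None:
        self.decided: Optional[int] = None

    def propose(self, v: int) -> int:
        if self.decided is None:
            self.decided = v
        return self.decided


class Contrarian:
    """Assumed R-arbitrary fault: always answers the opposite of the proposal."""

    def propose(self, v: int) -> int:
        return 1 - v


def f_propose(v: int, obj) -> int:
    """Filter of Figure 3: any non-binary response becomes 0."""
    loc = obj.propose(v)
    return loc if isinstance(loc, int) and loc in (0, 1) else 0


def access(v: int, objs: List[object]) -> int:
    """Majority vote over three base objects."""
    count = [0, 0]
    for obj in objs:
        count[f_propose(v, obj)] += 1
    return 0 if count[0] > count[1] else 1


def propose_1tol(v: int, O: List[object]) -> int:
    """Figure 5: two majority rounds over O[0..2] and O[3..5]."""
    v = access(v, O[0:3])
    v = access(v, O[3:6])
    return v


# One faulty object in the first group.
O = [Correct(), Contrarian(), Correct(), Correct(), Correct(), Correct()]
first = [propose_1tol(0, O), propose_1tol(1, O)]

# One faulty object in the second group.
O2 = [Correct(), Correct(), Correct(), Correct(), Contrarian(), Correct()]
second = [propose_1tol(1, O2), propose_1tol(0, O2)]
print(first, second)  # [0, 0] [1, 1]
```

In both runs the two correct members of each group outvote the faulty one, so every process returns the value proposed first.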


Proof Suppose that at most one of O_i (1 ≤ i ≤ 6) fails. Then either none of O_1, O_2, and
O_3 fails or none of O_4, O_5, and O_6 fails. Validity of O follows from Claim 6.9. If none of
O_4, O_5, and O_6 fails, agreement of O follows from Claim 6.10. If none of O_1, O_2, and O_3
fails, agreement of O follows from Claims 6.9 and 6.10. It is obvious that O always returns
0 or 1, is wait-free, and gracefully-degrading. □

Given the 1-tolerant gracefully degrading self-implementation in Figure 5, by
applying the Booster lemma (Lemma 5.1) we can obtain a t-tolerant self-implementation
of N-consensus for R-arbitrary failures. However, the resulting resource complexity is
O(t^(log_2 6)), which is even higher than the complexity of the implementation through
translation mentioned above. We therefore present below an alternative efficient recursive strategy.

See  Figure  6. 

Theorem 6.7 Figure 6 gives a t-tolerant (gracefully degrading) self-implementation of N-consensus
for R-arbitrary failures of resource complexity O(t log t).

Proof We prove the theorem through a series of claims. In all of them we assume that at
most t base objects fail.

A_0[1..3t+1], A_1[1..3t+1], B[1..4t+1] : (0-tolerant) N-consensus objects
O_1 : ⌈t/2⌉-tolerant N-consensus object
O_2 : ⌊t/2⌋-tolerant N-consensus object

Procedure Propose(p, v_p, O)
    count_p[0..1], WitnessCount_p[0..1], belief_p, ans1_p, ans2_p, v'_p, i, w : integer local to p
begin
1   count_p[0..1], WitnessCount_p[0..1] := (0,0), (0,0)
2   Phase 1: for i := 1 to 3t + 1 do
3                w := f-propose(p, v_p, A_{v_p}[i])
4                if w = v_p then count_p[v_p] := count_p[v_p] + 1
5   Phase 2: ans1_p := f-propose(p, v_p, O_1)
6   Phase 3: for i := 1 to 4t + 1 do
7                w := f-propose(p, ans1_p, B[i])
8                WitnessCount_p[w] := WitnessCount_p[w] + 1
9   Phase 4: for i := 1 to 3t + 1 do
10               w := f-propose(p, v_p, A_{1-v_p}[i])
11               if w = 1 - v_p then count_p[1 - v_p] := count_p[1 - v_p] + 1
12  Phase 5: Choose belief_p such that WitnessCount_p[belief_p] > WitnessCount_p[1 - belief_p].
13           if WitnessCount_p[belief_p] ≥ 3t + 1 and count_p[belief_p] ≥ 2t + 1 then
14               return(belief_p); exit Propose
15           if WitnessCount_p[belief_p] ≥ 2t + 1 and count_p[belief_p] ≥ t + 1 then
16               v'_p := belief_p
17           else v'_p := v_p
18           ans2_p := propose(p, v'_p, O_2)
19           return(ans2_p)
end

Figure 6: Efficient t-tolerant self-implementation of N-consensus for R-arbitrary failures

Claim 6.11 If O_1 fails, then O_2 does not fail.


Proof Since O_1 and O_2 are derived objects of ⌈t/2⌉-tolerant and ⌊t/2⌋-tolerant
self-implementations of N-consensus respectively, O_1 and O_2 tolerate up to ⌈t/2⌉ and ⌊t/2⌋
failed base objects respectively. Since at most t base objects fail, both O_1 and O_2 cannot
fail. □

Claim 6.12 If O_1 does not fail, then O satisfies validity and agreement.

Proof Suppose O_1 does not fail. Since a correct O_1 satisfies agreement, we have ans1_p =
ans1_q = v for all p, q. Thus every process proposes the same value v to every B[i] in Phase 3.
Since at most t objects in B[1..4t+1] lie (fail), belief_p = v and WitnessCount_p[belief_p] ≥
3t + 1 (for every p).

Also by the validity of O_1, some process q will have invoked propose(q, v, O_1) before
any process gets the response v from O_1. This implies that q will have finished Phase
1 before any process begins Phase 3. Since at most t objects in A_v[1..3t+1] may lie,
it follows that for all p, count_p[v] ≥ 2t + 1 by the end of Phase 4 of p. Thus we have
WitnessCount_p[belief_p] ≥ 3t + 1 and count_p[belief_p] ≥ 2t + 1 (for every p). Hence every p
decides v (the proposal of q) by line 14. □

Claim 6.13 If O_1 fails, O satisfies validity and agreement.

Proof Suppose O_1 fails. Then by Claim 6.11, O_2 does not fail. We need to consider two
cases.

CASE 1 Suppose some process p returns by line 14. This implies that WitnessCount_p[belief_p]
≥ 3t + 1 and count_p[belief_p] ≥ 2t + 1. Since at most t base objects may fail, it follows that
WitnessCount_q[belief_p] ≥ 2t + 1 and count_q[belief_p] ≥ t + 1 (for every q). This implies, by
line 12, belief_q = belief_p; let val = belief_p. Since WitnessCount_q[belief_q] ≥ 2t + 1 and
count_q[belief_q] ≥ t + 1 (for every q), either q returns belief_q = val by line 14 and we have
agreement between p and q, or q sets v'_q to belief_q = val by line 16. Thus every q that does
not return by line 14 proposes v'_q = val on O_2. Since O_2 does not fail, by validity of O_2,
ans2_q = v'_q = val, and q returns ans2_q = val by line 19. Again we have agreement between
p and q.

To see that O satisfies validity, note that count_p[belief_p] ≥ 2t + 1 implies that some
process proposed belief_p = val on at least t + 1 objects in A_{belief_p}[1..3t+1].

CASE 2 Suppose no process returns by line 14. Then every q returns ans2_q by line
19. Since O_2 does not fail, we have (for all p, q) ans2_p = ans2_q = val. Thus O satisfies
agreement.

By the validity of O_2, some process p must have proposed val to O_2. That is, v'_p = val.

In the algorithm, v'_p equals either v_p or belief_p. If v'_p = v_p, then clearly O satisfies validity. If
v'_p = belief_p ≠ v_p, then p must have executed line 16. It follows that count_p[belief_p] ≥ t + 1.
This implies, considering that at most t objects in A_{belief_p}[1..3t+1] fail, that some process
q proposed v_q = belief_p on some object in A_{belief_p}[1..3t+1]. Thus val = v'_p = belief_p = v_q,
and v_q is the initial proposal of q. Thus O satisfies validity. □

Claim 6.14 The resource complexity of the implementation in Figure 6 is O(t log t).

Proof Denoting the resource complexity of the t-tolerant (gracefully degrading)
self-implementation of N-consensus for R-arbitrary failures by f(t), we have the following
recurrence: f(t) = 2f(t/2) + 2(3t + 1) + (4t + 1) and f(1) = 6. Hence the result. □
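
The recurrence can be evaluated numerically with a short Python sketch. It assumes that t is a power of two, so that t/2 is always an integer; the function names f and via_translation are illustrative. For powers of two, substituting g(t) = f(t) + 3 gives g(t) = 2g(t/2) + 10t with g(1) = 9, hence f(t) = 10 t log_2 t + 9t - 3, which the loop checks.

```python
import math


def f(t: int) -> int:
    """Base objects used by the recursive construction, per the recurrence (t a power of two)."""
    if t == 1:
        return 6
    return 2 * f(t // 2) + 2 * (3 * t + 1) + (4 * t + 1)


def via_translation(t: int) -> int:
    """Base objects used when chaining the translation with the R-omission implementation."""
    return (3 * t + 1) * (t + 1)


table = []
for t in (1, 2, 4, 8, 16, 32):
    # Closed form of the recurrence for powers of two: 10 t log2(t) + 9 t - 3.
    assert f(t) == 10 * t * int(math.log2(t)) + 9 * t - 3
    table.append((t, f(t), via_translation(t)))
    print(t, f(t), via_translation(t))
```

Under these formulas the recursive construction uses 6 objects against 8 at t = 1 and 781 against 833 at t = 16, while for t = 2, 4, and 8 the chained construction is the smaller of the two; the O(t log t) bound is an asymptotic statement.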

To complete the proof of Theorem 6.7, note that agreement and validity follow from
Claims 6.12 and 6.13. It is obvious that the implementation is wait-free, gracefully
degrading, and that O satisfies integrity. □

6.2  Fault-tolerant  implementation  of  register 

The register type supports two operations, read and write v. The sequential specification
is simple: read returns the most recent value written. Lamport defined a weaker
(non-linearizable) object known as safe register [Lam86]. In the following, we first show how to
build a fault-tolerant safe register from safe registers, some of which may suffer R-arbitrary
failures. We then resort to the register construction results in the literature to show that
register has a self-implementation for R-arbitrary failures.

Lemma 6.1 Using 2t + 1 1-reader, 1-writer safe registers, at most t of which may suffer
R-arbitrary failures, we can implement a failure-free 1-reader, 1-writer, safe register.

Proof (sketch) To read the safe register, the reader reads all base registers, and returns
the majority response. If there is no majority, it returns an arbitrary value. To write a
value v into the register, the writer writes v to all base registers. It is easy to verify that
the above strategy implements a safe register that behaves correctly even if a maximum of
t base registers suffer R-arbitrary failures. □
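
The majority strategy in the proof sketch can be illustrated in Python. The classes SafeRegister and FaultyRegister and the helper names are hypothetical; FaultyRegister models only one simple R-arbitrary fault (every read returns a fixed junk value), and reads are not run concurrently with writes.

```python
from collections import Counter
from typing import List


class SafeRegister:
    """Correct base register (no concurrency is modelled in this sketch)."""

    def __init__(self, initial: int = 0) -> None:
        self.value = initial

    def write(self, v: int) -> None:
        self.value = v

    def read(self) -> int:
        return self.value


class FaultyRegister(SafeRegister):
    """Assumed R-arbitrary fault: every read returns a fixed junk value."""

    def __init__(self, junk: int) -> None:
        super().__init__()
        self.junk = junk

    def read(self) -> int:
        return self.junk


def write_all(registers: List[SafeRegister], v: int) -> None:
    """Writer: store v in every base register."""
    for r in registers:
        r.write(v)


def read_majority(registers: List[SafeRegister]) -> int:
    """Reader: return the majority response, or any response if there is none."""
    responses = [r.read() for r in registers]
    value, votes = Counter(responses).most_common(1)[0]
    if votes > len(registers) // 2:
        return value
    return responses[0]  # no majority: an arbitrary value is allowed


# t = 2: five base registers, two of them faulty.
regs = [SafeRegister(), FaultyRegister(7), SafeRegister(), FaultyRegister(9), SafeRegister()]
write_all(regs, 42)
value_read = read_majority(regs)
print(value_read)  # 42
```

With at most t of the 2t + 1 base registers faulty, at least t + 1 correct registers return the last written value, so that value always forms the majority for a read that does not overlap a write.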

It is possible to implement a multi-reader, multi-writer, atomic register using 1-reader,
1-writer, safe registers [Blo87, BP87, CW90, HV91, Lam86, NW87, Pet83, PB87, Sch88,
SAG87, Vid88, Vid89, VA86]. Thus we have the following theorem.

Theorem  6.8  register  has  a  t-tolerant  self-implementation  for  R-arbitrary  failures. 

6.3  Universality  results 

We  now  describe  how  to  implement  fault-tolerant  wait-free  shared  objects  of  a  generic  type. 
An object type T is finite if A(T) has only a finite number of states. Also let N-consensus
with reset be an N-process object type informally defined as follows: An object O of this
type behaves exactly like an object of type N-consensus with the difference that O supports
an extra operation reset. Applying “reset” to O will initialize O and make it available for
a fresh round of consensus. The operation “reset” is required to work only in the absence
of concurrent operations^.

Herlihy showed that every finite object type^10 has an implementation from (N-consensus
with reset, unbounded register) ([Her91]). Plotkin showed that the unbounded registers
can be replaced by boolean registers ([Plo89]). Using Plotkin’s result, together with
Theorems 6.7 and 6.8, we obtain the following corollary.

Corollary  6.1 

• Every finite object type has a t-tolerant implementation from (N-consensus with
reset, boolean register) for R-arbitrary failures.

• If a finite object type T implements N-consensus with reset and boolean register
then T has a t-tolerant self-implementation for R-arbitrary failures.


Herlihy’s construction can be easily modified to yield a universal implementation from
(N-consensus with reset, unbounded register) even for infinite object types. Thus
Corollary 6.1 holds even if T is an infinite object type, provided that boolean register is
replaced by unbounded register in the statement of the corollary.

Herlihy showed that queue, stack, test&set, fetch&add etc. implement 2-consensus,
and compare&swap implements N-consensus [Her91]. It is easy to show that test&set and
compare&swap implement boolean register, and queue, stack, and fetch&add implement
unbounded register. Thus,

Corollary 6.2 The following object types have t-tolerant self-implementations for R-arbitrary
failures: (2-process) queue, stack, test&set, fetch&add, and (N-process) compare&swap.
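
As a reminder of why test&set appears in this list, the following Python sketch shows the standard way two processes reach consensus with one test&set object and two registers. It is an illustration under assumptions: the class and method names are hypothetical, and a lock stands in for the atomicity that a hardware test&set would provide, so the sketch is not itself wait-free and says nothing about tolerating faulty base objects.

```python
import threading

class TestAndSet:
    """One-shot test&set bit; the lock only models hardware atomicity."""
    def __init__(self):
        self._lock = threading.Lock()
        self._bit = False

    def test_and_set(self):
        """Set the bit and return its previous value."""
        with self._lock:
            old = self._bit
            self._bit = True
            return old

class TwoConsensus:
    """2-process consensus from one test&set object and two registers."""
    def __init__(self):
        self._tas = TestAndSet()
        self._proposal = [None, None]   # one register per process

    def propose(self, pid, value):
        """pid is 0 or 1; returns the agreed decision value."""
        self._proposal[pid] = value          # announce own input first
        if not self._tas.test_and_set():     # winner: saw the bit unset
            return value
        return self._proposal[1 - pid]       # loser adopts the winner's input

consensus = TwoConsensus()
first = consensus.propose(0, "a")
second = consensus.propose(1, "b")
```

The loser can safely read the other register because the winner wrote its proposal before applying test&set.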


7  Tolerating  non-responsive  failures 


Unlike  responsive  failures,  non-responsive  failures  are  almost  always  impossible  to  cope 
with.  We  first  show  the  impossibility  of  implementing  a  consensus  object  from  any  finite 
list  of  base  objects,  one  of  which  may  crash.  We  do  so  by  a  reduction  from  the  consensus 
problem  among  a  finite  number  of  processes,  one  of  which  may  crash.  The  latter  problem 
is  known  to  be  unsolvable  [FLP85,  LAA87]. 

Theorem 7.1 There is no 1-tolerant implementation of 2-consensus for crash failures.

^9 Therefore N-consensus with reset cannot be defined modularly through sequential specification and
linearizability.

^10 An object type T is finite if A(T), the automaton giving the sequential specification of T, has only a
finite number of states.

Proof Suppose the theorem is false and there is a finite list L = (T_1, ..., T_l) of object
types such that there is a 1-tolerant implementation I of 2-consensus from L for crash
failures.

Now consider the following concurrent system S in which all objects are registers. Processes
in S are {p_1, p_2} ∪ {q_j | 1 ≤ j ≤ l}, and the registers are {decision} ∪ {invocation(i,j),
response(j,i) | 1 ≤ i ≤ 2, 1 ≤ j ≤ l}. We claim that the consensus problem is solvable in S
even if at most one process in S may crash. The following is the protocol. Let v_i ∈ {0,1}
be the input of p_i. The idea is that process q_j (1 ≤ j ≤ l) simulates an object o_j of type
T_j, and process p_i (i = 1,2) simulates the execution of propose(v_i) on the derived object
I(o_1, ..., o_l). The details are as follows.

Initialize all registers to ⊥. The process p_i simulates the execution of the procedure
propose(v_i) of the implementation I as explained below. If propose(v_i) requires
p_i to invoke some operation op on o_j, p_i appends op to the contents of invocation(i,j). If
propose(v_i) requires p_i to check if a response to some outstanding invocation on o_j has
arrived, p_i checks if a response has been appended (by q_j) to response(j,i). If propose(v_i)
requires p_i to decide some value v, p_i first writes v in the decision register, then decides it, and
halts execution. Also p_i periodically checks if the register decision contains a v ∈ {0,1}. If
so, it decides v and halts execution.

Process q_j simulates the base object o_j as follows. q_j checks the registers invocation(1,j)
and invocation(2,j) in a round-robin fashion. When it notices that some operation op has
been appended to invocation(i,j), it applies op to the local copy of o_j that it maintains
and appends the corresponding response to response(j,i). Also q_j periodically checks if the
register decision contains a v ∈ {0,1}. If so, it decides v and halts execution.

It is easy to verify that the above protocol solves the consensus problem among the
l + 2 processes in S even if at most one of them crashes. To see this, consider the following
cases:

1. No process crashes: Since every q_j, the process simulating object o_j, is correct and
propose(v_i) executed by p_i (i = 1,2) is a wait-free procedure, it follows that one of
p_1 and p_2 or both eventually write a value v ∈ {0,1} into decision. Thus every correct
process eventually decides v.

2. p_1 crashes: By our assumption that at most one process crashes, process p_2 and q_j
(1 ≤ j ≤ l), the process simulating object o_j, are all correct. Together with the fact
that propose(v_2) is a wait-free procedure, this implies that p_2 eventually writes a
decision value v into decision and decides v. Every other correct process eventually
observes v in decision and decides v.

3. p_2 crashes: By a symmetric argument.

4. q_k crashes (for some 1 ≤ k ≤ l): This corresponds to the crash of the simulated base
object o_k. Since I is 1-tolerant, the execution of propose(v_i) by process p_i (i = 1,2)
eventually terminates. Thus one of p_1 and p_2 or both write a value v into decision.
Thus every correct process eventually decides v.

In all the above cases, since I is an implementation of 2-consensus the following holds:
if both p_1 and p_2 write into the decision register, then they both write the same value, and
this value is either v_1 or v_2.

We showed that we can use I to solve the consensus problem in system S, and this
contradicts the impossibility result of Loui and Abu-Amara [LAA87]. □
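
The role of a simulating process q_j in the proof of Theorem 7.1 can be sketched in Python. This is a minimal sequential model under assumptions: append-only lists stand in for the registers invocation(i,j) and response(j,i), a FIFO queue is used purely as an example base type, and the names QueueObject and serve_round are hypothetical. Asynchrony and crashes are not modelled.

```python
from collections import deque

class QueueObject:
    """Local copy of a base object o_j; the queue type is only an example."""
    def __init__(self):
        self.items = deque()

    def apply(self, op):
        kind, arg = op
        if kind == "enq":
            self.items.append(arg)
            return "ok"
        if kind == "deq":
            return self.items.popleft() if self.items else None
        raise ValueError(kind)

def serve_round(obj, invocation, response, served, decision):
    """One round-robin pass of q_j over invocation(1,j) and invocation(2,j).

    served[i] counts the operations of p_i already applied. Returns the
    decided value if the decision register is set (q_j then halts), else None.
    """
    if decision[0] is not None:
        return decision[0]
    for i in (1, 2):
        if served[i] < len(invocation[i]):           # a new op was appended
            op = invocation[i][served[i]]
            response[i].append(obj.apply(op))        # append the response
            served[i] += 1
    return None

invocation = {1: [("enq", 5)], 2: [("deq", None)]}
response = {1: [], 2: []}
served = {1: 0, 2: 0}
decision = [None]                                    # None plays the role of ⊥
obj = QueueObject()
halted_with = serve_round(obj, invocation, response, served, decision)
```

If q_j stops running, no further responses are appended, which is exactly how a crash of q_j appears to p_1 and p_2 as a crash of the simulated object o_j.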

We can strengthen the above result as follows. Suppose that at most one base object
may fail, and it can only do so by being “unfair” (i.e., by not responding) to at most one
process. Furthermore, suppose the identity of this process is a priori “common knowledge”
among all the processes. Even with this extremely weak model of object failure, called
1-unfairness to a known process, we can prove the following:

Theorem 7.2 There is no 1-tolerant implementation of 2-consensus for 1-unfairness to
a known process.

Proof (Sketch) Assume the theorem is false, namely, there is a 1-tolerant implementation
I of 2-consensus for 1-unfairness to process p_1. Now proceed as in the proof of Theorem 7.1.
Cases 1, 2, and 3 still hold. Consider Case 4, where q_k crashes (for some 1 ≤ k ≤ l). This
corresponds to the crash of the simulated base object o_k. This object is now potentially
unfair to both p_1 and p_2. But I tolerates unfairness to only p_1. We circumvent this difficulty
by modifying p_2’s protocol as follows. If propose(v_2) requires p_2 to invoke some operation
op on some o_j, p_2 appends op to the contents of invocation(2,j), as before, but now it also
waits until a corresponding response is appended to response(j,2) (by process q_j).^11 Thus,
if p_2 attempts to access o_k after the crash of q_k, it will simply wait for the response forever.
Therefore, at worst, the crash of q_k looks like o_k is unfair to p_1, and p_2 is extremely slow.
Since I tolerates the unfairness of one base object to p_1, I(o_1, ..., o_l) continues to behave as
a wait-free consensus object. Hence the procedure propose(v_1) executed by p_1 eventually
terminates returning the decision value. As before, this value is written into decision, and
eventually every correct process decides. Again, we have a contradiction to the impossibility
result in [LAA87]. □

Let  C  be  the  class  of  all  object  types  that  can  implement  2-consensus.  From  the  above 
two  theorems  we  have 

Corollary 7.1 For all T ∈ C, there is no 1-tolerant implementation of T for crash or
1-unfairness to a known process.

From [Her91] and this corollary, we conclude that Queue, Stack, Test&Set, Fetch&Add,
Compare&Swap, and several other common types do not have a 1-tolerant implementation
for crash or 1-unfairness to a known process. In contrast to the above impossibility results
we show

Theorem  7.3  register  has  a  t-tolerant  self-implementation  for  arbitrary  failures. 

^11 It is easy to see that with this modification Cases 1, 2, and 3 still hold.




This  follows  from 


Lemma 7.1 Using 5t + 1 1-reader, 1-writer safe registers, at most t of which may suffer
arbitrary failures, we can implement a failure-free 1-reader, 1-writer, safe register.

Proof (Sketch) Informally, the reader invokes ‘read’ on all registers (on which it has no
pending invocation) and waits until 4t + 1 respond. It then returns the majority value. If
there is no majority, it returns an arbitrary value. The writer writes to all registers (on
which it has no pending write). It waits until 4t + 1 of them return an “operation completed”
response. It is easy to verify that the above strategy implements a safe register that works
correctly even if a maximum of t base registers suffer arbitrary failures. □
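
One way to see where the numbers 5t + 1 and 4t + 1 come from is the following count, checked mechanically in Python. It is a simplified sketch and not the omitted verification: it assumes that the read does not overlap any write and that every response the reader counts answers a read invoked during the current operation. The function name is hypothetical.

```python
def guaranteed_correct_votes(t):
    """Lower bound on reader responses that carry the last written value v."""
    n = 5 * t + 1             # number of base registers
    quorum = 4 * t + 1        # responses awaited by the writer and by the reader
    hold_v = quorum - t       # acknowledgements minus possibly faulty ones:
                              # correct registers that certainly hold v
    unheard = n - quorum      # registers whose response the reader does not await
    return hold_v - unheard   # correct holders of v among the reader's responses

for t in range(0, 20):
    votes = guaranteed_correct_votes(t)
    assert votes == 2 * t + 1          # the bound simplifies to 2t + 1
    assert 2 * votes > 4 * t + 1       # a strict majority of 4t + 1 responses
```

Under these assumptions at least 2t + 1 of the 4t + 1 responses report v, so the majority vote returns v even though up to t registers never answer and up to t answers are wrong.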


8  Other  basic  results 

Consider a system that supports a given set H of primitive hardware objects. Assume that
these objects may fail, but if they do, they are guaranteed to only fail by R-crash. Suppose
we wish to build an object O using only objects in H, and O is only required to function
correctly in the absence of failures. However, when objects in H fail by R-crash, we would
like O to fail only by R-crash. This last requirement is desirable for two reasons:

• The simple “once ⊥, everafter ⊥” property of R-crash is the most benign type of
failure.

• Such an object O appears like any other primitive hardware object of the system:
With O, the system would be no different, in functionality and failure semantics,
from one that supports H ∪ {O} as its primitive hardware objects.

In our terminology, a (0-tolerant) gracefully degrading implementation is exactly what
we are looking for. The existence of such an implementation depends on the type of O and
the types of the objects in H. Unfortunately, as we show below, most objects do not have
such implementations even when H includes very powerful objects.

An object type T is order-sensitive if it is a deterministic N-process type (N ≥ 2) and
the following holds: There exist state S in A(T), operations op, op' (not necessarily distinct)
in OP(T), and values u, v, u', v' such that each of (op,u),(op',u') and (op',v'),(op,v) is a
sequential execution from state S consistent with T, and u ≠ v and u' ≠ v'. Queue is an
example of an order-sensitive object type. To see this, instantiate S to the state in which
there are two elements 5 and 10 in the queue (5 in the front), and both op and op' to deq.
Now we have u = 5, u' = 10, v' = 5, and v = 10. Thus u ≠ v and u' ≠ v', as required.
Stack, Test&Set, Compare&Swap are some other examples of order-sensitive object types.
An object type is non order-sensitive if it is deterministic and not order-sensitive. Examples
of non order-sensitive types include register, sticky bit, move, and swap.
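
The definition can be checked by brute force on small instances. The Python sketch below is illustrative only: it encodes a queue and a register as hypothetical transition functions (state, op) -> (new state, response), and the helper names run and is_witness are assumptions of this example, not notation from the paper. The register search covers only the small state and operation sets listed in the code.

```python
def run(apply_op, state, ops):
    """Apply ops sequentially from state and return the list of responses."""
    responses = []
    for op in ops:
        state, resp = apply_op(state, op)
        responses.append(resp)
    return responses

def is_witness(apply_op, state, op, op2):
    """True if (state, op, op2) witnesses order-sensitivity."""
    u, u2 = run(apply_op, state, [op, op2])    # (op,u),(op',u')
    v2, v = run(apply_op, state, [op2, op])    # (op',v'),(op,v)
    return u != v and u2 != v2

def queue_apply(state, op):
    """Queue as a tuple, front at index 0; op is "deq" or ("enq", x)."""
    if op == "deq":
        return (state[1:], state[0]) if state else (state, None)
    _, x = op
    return state + (x,), "ok"

def register_apply(state, op):
    """Register holding one value; op is "read" or ("write", x)."""
    if op == "read":
        return state, state
    _, x = op
    return x, "ok"

queue_is_order_sensitive = is_witness(queue_apply, (5, 10), "deq", "deq")

register_ops = ["read", ("write", 0), ("write", 1)]
register_has_witness = any(
    is_witness(register_apply, s, a, b)
    for s in (0, 1) for a in register_ops for b in register_ops
)
```

For the queue the two orders give (u, u') = (5, 10) and (v, v') = (10, 5), matching the instantiation in the text. For the register, a write always returns the same acknowledgement, so at most one of the two responses can depend on the order.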

Theorem 8.1 There is no (0-tolerant) gracefully degrading implementation of any
order-sensitive object type for R-crash from any list of non order-sensitive object types.

Proof Omitted. □

Preserving the failure semantics of the underlying system is a highly desirable property
of  an  implementation.  For  R-crash,  the  above  theorem  shows  that  this  property  is  not 
achievable  in  many  cases:  implementations  necessarily  amplify  the  severity  of  the  R-crash 
failures  of  the  underlying  system.  For  example,  consider  a  system  that  supports  registers 
and  sticky  bits  in  “hardware”.  In  such  a  system  any  object  can  be  implemented  [Plo89], 
including  (for  example)  queues.  Assume  the  given  registers  and  sticky  bits  only  fail  by 
R-crash.  Can  we  implement  a  queue  that  also  fails  by  R-crash?  The  above  theorem  shows 
that  this  cannot  be  done! 

Requiring a derived object to inherit the R-crash semantics of its base objects is even
more difficult if we add the requirement that the derived object be 1-tolerant. Even if we do
not restrict the types of primitives available in the underlying system, such implementations
do not exist for most objects of interest! This is shown by the theorem below.

Theorem 8.2 There is no 1-tolerant gracefully degrading implementation of any
order-sensitive object type for R-crash.

Proof For a contradiction, assume L = (T_1, T_2, ..., T_n) is a list of types such that
there is a 1-tolerant gracefully degrading implementation I of an order-sensitive type T from
L for R-crash. We prove the theorem through a series of claims, involving “indistinguishable”
scenarios. Let O = I(O_1, O_2, ..., O_n), and op, op', S, u, v, u', v' be as given in the
definition of order-sensitive types.

Claim 8.1 Suppose O is in state S, and processes p and q execute Apply(p, op, O) and
Apply(q, op', O) respectively. For any interleaving of Apply(p, op, O) and Apply(q, op', O),
either Apply(p, op, O) returns u and Apply(q, op', O) returns u' or Apply(p, op, O) returns
v and Apply(q, op', O) returns v'.

Proof In the linearization of the execution history, either Apply(p, op, O) precedes
Apply(q, op', O) or Apply(q, op', O) precedes Apply(p, op, O). This, together with the
definitions of u, u', v, v', and the fact that T is a deterministic type, trivially implies the claim. □

Claim 8.2 There exists a sequence α of steps (of p) and a step s (of p) such that the
following Scenarios S1 and S2 are possible.

Scenario S1 (scenario starts with O in state S)

1. Process p initiates and partially executes Apply(p, op, O) by completing the steps in
α.

2. Process q initiates and completes (all the steps of) Apply(q, op', O), returning v'.

3. p completes the remaining steps of Apply(p, op, O), returning v.

Scenario S2 (scenario starts with O in state S)

1. p initiates and (partially) executes Apply(p, op, O) by completing the steps in α.s.

2. q initiates and completes (all the steps of) Apply(q, op', O), returning u'.

3. p completes the remaining steps of Apply(p, op, O), returning u.


Proof Clearly if process p executes no steps of Apply(p, op, O) before process q initiates
and completes Apply(q, op', O), then Apply(q, op', O) must return v'. Further if p initiates
and completes all the steps of Apply(p, op, O) (let β be this sequence of steps) before q
initiates and completes Apply(q, op', O), then Apply(q, op', O) must return u'. Together
with Claim 8.1 by which Apply(q, op', O) must return either u' or v', the above implies that
there exists a sequence α of steps and a step s such that α.s is a prefix of β for which the
claim holds. □

Hereafter we will assume O_k is the base object accessed by p in step s.

Claim  8.3  Consider 

Scenario  S3  (scenario  starts  with  O  in  state  S) 

1. p initiates and (partially) executes Apply(p, op, O) by completing the steps in α.s.

2. q initiates and completes (all the steps of) Apply(q, op', O), returning u' (as in S2).

3. O_1, O_2, ..., O_n fail by R-crash.

4. p completes the remaining steps of Apply(p, op, O).

Then Apply(p, op, O) returns u.

Proof Suppose Apply(p, op, O) returns ⊥. Since I is gracefully degrading, the failure
of O must appear like R-crash. This requires, given that Apply(q, op', O) returns a non-⊥
response, that Apply(q, op', O) precede Apply(p, op, O) in the linearization order. Doing
so, however, implies that (op',u') is a sequential execution from S consistent with T. This
cannot be true since u' ≠ v', T is deterministic, and (op',v') is a sequential execution from
S consistent with T. Thus Apply(p, op, O) cannot return ⊥.

Suppose Apply(p, op, O) returns w where ⊥ ≠ w ≠ u. Since in the linearization,
either Apply(p, op, O) precedes Apply(q, op', O) or Apply(q, op', O) precedes Apply(p, op, O),
it follows that either (op,w),(op',u') or (op',u'),(op,w) is a sequential execution from S
consistent with T. This cannot be true since T is deterministic and (op,u),(op',u') and
(op',v'),(op,v) are sequential executions from S consistent with T and w ≠ u, u' ≠ v'.

We  conclude  that  Apply(p,op,  O)  must  return  u.  □ 

Claim 8.4 Consider

Scenario S4 (scenario starts with O in state S)

1. p initiates and (partially) executes Apply(p, op, O) by completing the steps in α.s.

2. O_k fails by R-crash.

3. q initiates and completes (all the steps of) Apply(q, op', O).

4. O_1, ..., O_{k-1} and O_{k+1}, ..., O_n also fail by R-crash.

5. p completes the remaining steps of Apply(p, op, O).

Then Apply(p, op, O) returns u and Apply(q, op', O) returns u'.

Proof Clearly S4 ≈_p S3, that is, S4 and S3 are indistinguishable to p. Therefore, as in S3,
Apply(p, op, O) returns u in S4. Since I is 1-tolerant, and since only O_k has failed by the
completion of Apply(q, op', O), Apply(q, op', O) must return a non-⊥ response. From the
definitions of u, u', v, v', it is easy to verify that the only non-⊥ response that satisfies
linearizability is u'. □

Claim 8.5 Consider

Scenario S5 (scenario starts with O in state S)

1. p initiates and partially executes Apply(p, op, O) by completing the steps in α.

2. O_k fails by R-crash.

3. q initiates and completes (all the steps of) Apply(q, op', O).

4. O_1, ..., O_{k-1} and O_{k+1}, ..., O_n also fail by R-crash.

5. p completes the remaining steps of Apply(p, op, O).

Then Apply(p, op, O) returns u.

Proof Clearly S5 ≈_q S4, that is, S5 and S4 are indistinguishable to q. Therefore
Apply(q, op', O) returns u' as in S4. By similar arguments as in Claim 8.3, it can be shown
that Apply(p, op, O) returns u. □

Claim  8.6  Consider 

Scenario  S6  (scenario  starts  with  O  in  state  S) 

1. p initiates and partially executes Apply(p, op, O) by completing the steps in α.

2. q initiates and completes (all the steps of) Apply(q, op', O).

3. All base objects O_1, O_2, ..., O_n fail by R-crash.

4. p completes the remaining steps of Apply(p, op, O).

Then Apply(p, op, O) returns u, and Apply(q, op', O) returns v'.




Proof Since S6 ≈_p S5 (the two scenarios are indistinguishable to p), Apply(p, op, O)
returns u as in S5. Since S6 ≈_q S1 (the two scenarios are indistinguishable to q),
Apply(q, op', O) returns v' as in S1. □

Neither (op,u),(op',v') nor (op',v'),(op,u) is a sequential execution from S consistent
with T. Hence the execution in Claim 8.6 is not linearizable. Thus the failure of O in S6 is
more severe than R-crash. We conclude that I is not a gracefully degrading implementation
for R-crash, a contradiction which concludes the proof of Theorem 8.2. □
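
For the queue instance used earlier (S holds 5 and 10 with 5 in front, and op = op' = deq), the final contradiction can be replayed concretely. The Python sketch below is only an illustration of that one instance; the helper deq and the variable names are hypothetical.

```python
def deq(state):
    """Dequeue from a non-empty queue given as a tuple (front first)."""
    return state[1:], state[0]

S = (5, 10)

# p's deq linearized first: p receives u, then q receives u'.
after_p, u = deq(S)
_, u_prime = deq(after_p)

# q's deq linearized first: q receives v', then p receives v.
after_q, v_prime = deq(S)
_, v = deq(after_q)

# (response to p, response to q) pairs permitted by linearizability.
allowed = {(u, u_prime), (v, v_prime)}

# Claim 8.6 forces p to return u and q to return v' in Scenario S6.
s6_outcome = (u, v_prime)
s6_is_linearizable = s6_outcome in allowed
```

Here both processes would dequeue the element 5, an outcome that neither sequential order produces, and neither response is ⊥, so the failure cannot be passed off as an R-crash of O.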

The above discussion raises some questions on the “practicality” of the R-crash model:
Even if “hardware” objects fail by R-crash, “software” objects don’t. The R-omission model
defined in this paper does not have this serious limitation. In fact, for any t > 0 every
object type has a t-tolerant gracefully degrading implementation from (universal type,
register) for R-omission. In other words, implementations preserving the R-omission
semantics of the underlying system always exist. This is a formal justification for adopting
the R-omission model of failure.